fix(integrations): langchain add multimodal content transformation functions for images, audio, and files #5278

constantinius · 2026-01-05T19:16:37Z

Description

Add support for more message content types (images, audio, and files) in gen_ai.request.messages

Issues

Closes: https://linear.app/getsentry/issue/TET-1637/redact-images-langchain


linear · 2026-01-05T19:16:41Z

TET-1637 Redact images: Langchain

sentry_sdk/integrations/langchain.py

…eport-binary-data

github-actions · 2026-01-13T12:51:14Z

Semver Impact of This PR

🟢 Patch (bug fixes)

📋 Changelog Preview

This is how your changes will appear in the changelog.
Entries from this PR are highlighted with a left border (blockquote style).

New Features ✨

feat(ai): add parse_data_uri function to parse a data URI by constantinius in #5311
feat(asyncio): Add on-demand way to enable AsyncioIntegration by sentrivana in #5288
feat(openai-agents): Inject propagation headers for HostedMCPTool by alexander-alderman-webb in #5297
feat: Support array types for logs and metrics attributes by alexander-alderman-webb in #5314

Bug Fixes 🐛

fix(ai): redact message parts content of type blob by constantinius in #5243
fix(clickhouse): Guard against module shadowing by alexander-alderman-webb in #5250
fix(gql): Revert signature change of patched gql.Client.execute by alexander-alderman-webb in #5289
fix(grpc): Derive interception state from channel fields by alexander-alderman-webb in #5302

fix(integrations): langchain add multimodal content transformation functions for images, audio, and files by constantinius in #5278

fix(litellm): Guard against module shadowing by alexander-alderman-webb in #5249
fix(pure-eval): Guard against module shadowing by alexander-alderman-webb in #5252
fix(ray): Guard against module shadowing by alexander-alderman-webb in #5254
fix(threading): Handle channels shadowing by sentrivana in #5299
fix(typer): Guard against module shadowing by alexander-alderman-webb in #5253
fix: Stop suppressing exception chains in AI integrations by alexander-alderman-webb in #5309
fix: Send client reports for span recorder overflow by sentrivana in #5310

Documentation 📚

docs(metrics): Remove experimental notice by alexander-alderman-webb in #5304
docs: Update Python versions banner in README by sentrivana in #5287

Internal Changes 🔧

Release

ci(release): Bump Craft version to fix issues by BYK in #5305
ci(release): Switch from action-prepare-release to Craft by BYK in #5290

Other

chore(gen_ai): add auto-enablement for google genai by shellmayr in #5295
chore: add unlabeled trigger to changelog-preview by BYK in #5315
chore: Add type for metric units by sentrivana in #5312
ci: Update tox and handle generic classifiers by sentrivana in #5306

🤖 This preview updates automatically when you update the PR.

sentry_sdk/integrations/langchain.py

…tive content formats

sentry_sdk/integrations/langchain.py

…eport-binary-data

…ats and use common function for data URI parsing

sentry_sdk/integrations/langchain.py

…AI messages

Add transform_content_part() and transform_message_content() functions to standardize content part handling across all AI integrations. These functions transform various SDK-specific formats (OpenAI, Anthropic, Google, LangChain) into a unified format:

- blob: base64-encoded binary data
- uri: URL references (including file URIs)
- file: file ID references

Also adds get_modality_from_mime_type() helper to infer content modality (image/audio/video/document) from MIME types.
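For readers who have not seen the diff, the unified format and the modality inference described above could look roughly like the sketch below. It is not the code from this PR: the function names (`guess_modality`, `make_blob_part`, `make_uri_part`, `make_file_part`), the dictionary keys, and the fallback to `"document"` for unknown MIME types are all assumptions made for illustration.

```python
from typing import Any, Dict


def guess_modality(mime_type: str) -> str:
    """Hypothetical stand-in for the MIME-type-to-modality helper."""
    top_level = (mime_type or "").lower().split("/", 1)[0]
    if top_level in ("image", "audio", "video"):
        return top_level
    # Assumption: anything else (PDFs, text, unknown types) counts as a document.
    return "document"


def make_blob_part(mime_type: str, base64_data: str) -> Dict[str, Any]:
    """Unified 'blob' part: base64-encoded binary data (key names are assumed)."""
    return {
        "type": "blob",
        "modality": guess_modality(mime_type),
        "mime_type": mime_type,
        "content": base64_data,
    }


def make_uri_part(mime_type: str, uri: str) -> Dict[str, Any]:
    """Unified 'uri' part: a URL or file URI reference (key names are assumed)."""
    return {
        "type": "uri",
        "modality": guess_modality(mime_type),
        "mime_type": mime_type,
        "uri": uri,
    }


def make_file_part(file_id: str) -> Dict[str, Any]:
    """Unified 'file' part: an opaque file ID reference (key names are assumed)."""
    return {"type": "file", "file_id": file_id}
```

Keeping the three shapes distinct means a later redaction step only has to look at the `type` key to decide whether a part carries inline binary data.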

Replace local _transform_langchain_content_block and _get_modality_from_mime_type functions with the shared transform_content_part function. This removes ~170 lines of duplicated code.

Add dedicated transform functions for each AI SDK:

- transform_openai_content_part() for OpenAI/LiteLLM image_url format
- transform_anthropic_content_part() for Anthropic image/document format
- transform_google_content_part() for Google GenAI inline_data/file_data
- transform_generic_content_part() for LangChain-style generic format

Refactor transform_content_part() to be a heuristic dispatcher that detects the format and delegates to the appropriate specific function. This allows integrations to use the specific function directly for better performance and clarity, while maintaining backward compatibility through the dispatcher for frameworks that can receive any format.

Added 38 new unit tests for the SDK-specific functions.
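The dispatcher idea can be illustrated with a small sketch. This is not the implementation from the PR: the function names, the detection order, the output keys, and the simplified data URI handling are assumptions, and the input shapes only reflect how each SDK's content parts commonly look.

```python
from typing import Any, Dict, Optional

Part = Dict[str, Any]


def _blob(mime_type: str, data: Any) -> Part:
    return {"type": "blob", "mime_type": mime_type, "content": data}


def _uri(uri: str, mime_type: str = "") -> Part:
    return {"type": "uri", "mime_type": mime_type, "uri": uri}


def from_openai(part: Part) -> Optional[Part]:
    """OpenAI/LiteLLM style: {"type": "image_url", "image_url": {"url": ...}}."""
    image_url = part.get("image_url")
    url = image_url.get("url") if isinstance(image_url, dict) else image_url
    if not isinstance(url, str):
        return None
    if url.startswith("data:") and "," in url:
        # Simplified data URI split: "data:<mime>;base64,<payload>".
        header, data = url[len("data:"):].split(",", 1)
        return _blob(header.split(";", 1)[0], data)
    return _uri(url)


def from_anthropic(part: Part) -> Optional[Part]:
    """Anthropic style: {"type": "image", "source": {"type": "base64", ...}}."""
    source = part.get("source")
    if not isinstance(source, dict):
        return None
    kind = source.get("type")
    if kind == "base64":
        return _blob(source.get("media_type", ""), source.get("data", ""))
    if kind == "url":
        return _uri(source.get("url", ""))
    if kind == "file":
        return {"type": "file", "file_id": source.get("file_id", "")}
    return None


def from_google(part: Part) -> Optional[Part]:
    """Google GenAI style: {"inline_data": {...}} or {"file_data": {...}}."""
    inline = part.get("inline_data")
    if isinstance(inline, dict):
        return _blob(inline.get("mime_type", ""), inline.get("data", ""))
    file_data = part.get("file_data")
    if isinstance(file_data, dict):
        return _uri(file_data.get("file_uri", ""), file_data.get("mime_type", ""))
    return None


def from_generic(part: Part) -> Optional[Part]:
    """Generic style (assumed keys): base64, url, or file_id at the top level."""
    mime_type = part.get("mime_type", "")
    if "base64" in part:
        return _blob(mime_type, part["base64"])
    if "url" in part:
        return _uri(part["url"], mime_type)
    if "file_id" in part:
        return {"type": "file", "file_id": part["file_id"]}
    return None


def dispatch(part: Any) -> Optional[Part]:
    """Heuristic dispatcher: detect the format, then delegate."""
    if not isinstance(part, dict):
        return None  # plain strings and other non-dict items are left alone
    if part.get("type") == "image_url":
        return from_openai(part)
    if "inline_data" in part or "file_data" in part:
        return from_google(part)
    if isinstance(part.get("source"), dict):
        return from_anthropic(part)
    return from_generic(part)
```

An integration that knows which SDK it wraps can call the specific function directly and skip the detection step; a framework such as LangChain, which can be handed any of these shapes, would go through the dispatcher.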

constantinius added 5 commits December 17, 2025 10:45

fix(ai): redact message parts content of type blob

1f32952

fix(ai): skip non dict messages

795bcea

fix(ai): typing

a623e13

fix(ai): content items may not be dicts

3d3ce5b

fix(integrations): langchain add multimodal content transformation fu…

c606b66

…nctions for images, audio, and files

constantinius requested a review from a team as a code owner January 5, 2026 19:16

cursor bot reviewed Jan 5, 2026
